ABSTRACT

We consider the average cost per unit time problem for wide bandwidth noise driven control systems, where the average cost is in the pathwise sense; no expectations are used. Let $t$ = time of control and BW = bandwidth. For our class of processes, we prove various uniformity properties for the convergence of the pathwise average costs as $t \to \infty$ and BW $\to \infty$. Let $u^\delta(\cdot)$ be a smooth $\delta$-optimal control for the limit controlled diffusion (the limit as BW $\to \infty$) for the (mean) average cost per unit time problem. We show that for large enough $t$ and BW, $u^\delta(\cdot)$ is $2\delta$-optimal (with a probability arbitrarily close to 1) for the pathwise wide bandwidth problem. This uniformity is important in applications, for we often have only one long sequence to control, and the expectation is inappropriate. Also, otherwise, as BW $\to \infty$, it might take longer and longer to well approximate the limit pathwise average cost. Applications to related ‘pathwise average’ problems are given: the convergence of the average pathwise errors for an ‘approximate’ non-linear filter with wide bandwidth observation and system driving noise, and the convergence and accuracy of Monte Carlo calculations of Liapunov exponents for wide bandwidth noise driven systems (as BW $\to \infty$) via average cost/unit time methods. It is also shown for the discounted cost problem that the optimum pathwise costs converge to the minimum average cost per unit time as both the discount factor goes to zero and BW $\to \infty$.

Key  words:  pathwise  average  cost  per  unit  time,  ergodic  control, 
approximations  of  ergodic  control,  wide  band  noise  driven  systems, 
approximate  non-linear  filtering,  Liapunov  exponents,  discounted  cost. 




1.  Introduction 

Average cost per unit time (over an infinite time horizon) optimal control problems for diffusion and other Markov models have been dealt with in various ways (see, e.g., [1]-[3]). We treat such a problem for ‘wideband noise driven’ and related systems, which are ‘close’ to a diffusion, and when the average is in the pathwise but not necessarily in the mean value sense. The general method works for many other classes of processes which are suitably approximated by an appropriate controlled Markov process. As pointed out below and in Sections 4 and 5, the results have applications to many other problems where pathwise averages are important, and the noises are ‘wide band’. E.g., in Section 5, we treat the problem where both BW $\to \infty$ and discount factor $\to 0$.

Let the diffusion model be given in the relaxed control form (1.1), where $b(\cdot,\cdot)$ and $\sigma(\cdot)$ are continuous (other conditions will be listed below) and $m_t(\cdot)$ is an admissible relaxed control [1], [3], [4], over a compact control value space $U$. The relaxed control might be of the feedback form. The precise definition is in the Appendix. We note here that $m_t(\cdot)$ is a measure over the Borel sets of $U$.

$$dx = \int b(x,\alpha)\,m_t(d\alpha)\,dt + \sigma(x)\,dw. \tag{1.1}$$


In [1], relaxed controls were used to get nearly optimal controls for several ‘wideband’ noise driven systems, and in [3], they were cleverly used to get an ‘occupation measure’ for the state-control pair which ultimately allowed the authors to demonstrate the existence of an optimal stationary control. These advantages also occur for the particular problems to be described below. In [1], [2], the cost of concern was ([2] did not use relaxed controls)

$$\lim_{T}\frac{1}{T}\int_0^T\!\!\int E\,k(x(t),\alpha)\,m_t(d\alpha)\,dt = \gamma(m), \tag{1.2}$$

for a bounded continuous $k(\cdot)$.

In practice, of course, one does not have a process which is a diffusion, and it is of considerable interest to consider wide bandwidth noise driven systems of the form

$$\dot x^\epsilon = \int b(x^\epsilon,\alpha)\,m_t(d\alpha) + F^\epsilon(x^\epsilon,\xi^\epsilon), \tag{1.3}$$

where $\xi^\epsilon(\cdot)$ is the wide bandwidth noise. We use the scaling $\xi^\epsilon(t) = \xi(t/\epsilon^2)$ for an appropriate ‘mixing’ process $\xi(\cdot)$ owing to its convenience in simplifying the details. But it should be clear that the method is of fairly general applicability. Reference [1] dealt with a system of type (1.3) (with weak limit of type (1.1)) and cost of the form (1.2). It was shown, under the conditions there, that for any $\delta > 0$, a smooth $\delta$-optimal control $u^\delta$ for (1.1), (1.2) was also ‘nearly’ optimal for (1.3) and (1.4), for small $\epsilon$:

$$\lim_{T}\frac{1}{T}\int_0^T\!\!\int E\,k(x^\epsilon(t),\alpha)\,m^\epsilon_t(d\alpha)\,dt = \gamma^\epsilon(m), \tag{1.4}$$

i.e., $\lim_\epsilon \gamma^\epsilon(m^\epsilon) \ge \lim_\epsilon \gamma^\epsilon(u^\delta) - \delta$ for any sequence $m^\epsilon$.

Such results are helpful in justifying the use of the ideal limit process (1.1) in control theory.

In  [3],  Borkar  and  Ghosh  showed  the  existence  of  an  optimal  feedback 
control  for  the  diffusion  model  (under  this  control  the  diffusion  could  be 
taken  to  be  stationary)  and  cost  function  (1.2),  but  with  the  E  deleted  —  a 
pathwise  result.  This  paper  is  devoted  to  a  related  problem  for  the  model 
(1.3). Define

$$\gamma_T(m) = \frac{1}{T}\int_0^T\!\!\int k(x(s),\alpha)\,m_s(d\alpha)\,ds, \qquad \gamma(m) = \lim_{T}\gamma_T(m), \tag{1.5}$$

$$\gamma^\epsilon_T(m) = \frac{1}{T}\int_0^T\!\!\int k(x^\epsilon(s),\alpha)\,m_s(d\alpha)\,ds. \tag{1.6}$$


If $m(\cdot)$ is equivalent to a classical control function $u(\cdot)$, we write $u$ in lieu of $m$ in $\gamma_T(m)$, etc. The ‘pathwise’ convergence result in [3] is of particular importance in applications, since one often has a single long realization, and then the expectation is not appropriate in the cost function. The results in [3] (under their conditions) give the existence of a feedback relaxed control $m(\cdot)$ such that $\gamma_T(m) \to \bar\gamma = \inf_{m'} \lim_T \gamma_T(m')$ w.p.1.

In our problem here, owing to the wideband noise and the appearance of the two parameters $\epsilon$ and $T$, w.p.1 type convergence results are usually either meaningless or impossible to obtain. Typically, in an application one has a particular process with a given wide bandwidth driving force. One is interested in knowing how well good controls for the ‘limit’ problem do on the actual physical problem. The wide bandwidth driving term is imbedded into a sequence for purposes of getting an approximation result, and w.p.1 type results might make little sense.


Let $u^\delta(\cdot)$ denote a ‘nice’ $\delta$-optimal classical control (‘nice’ is defined in the next section) for model (1.1) and cost function (1.4). Then we wish to show (1.8a) and (1.8b):

$$\gamma^\epsilon_T(u^\delta) \xrightarrow{P} \gamma(u^\delta), \quad \text{as } \epsilon \to 0,\ T \to \infty, \tag{1.8a}$$

$$\lim_{\epsilon,T} P\{\gamma^\epsilon_T(m^\epsilon) \ge \gamma(u^\delta) - \delta\} = 1 \tag{1.8b}$$

for any sequence of admissible relaxed controls $m^\epsilon(\cdot)$. Since the time derivative of $\gamma^\epsilon_T(m)$ is $O(1/T)$ uniformly in $\epsilon$, $m$, $\omega$, the convergence is somewhat stronger than indicated by (1.8). Eqn. (1.8b) implies a type of uniformity of convergence, since the way that $\epsilon \to 0$ and $T \to \infty$ is not important. Were this ‘uniformity’ not the case, it would be possible that as $\epsilon \to 0$, a larger and larger $T$ is needed in order to closely approximate the limit value. In that case, the white noise limit (1.1) would not be useful for predictive or control purposes, when the true model is (1.3).
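
As a rough numerical illustration of the kind of uniformity expressed by (1.8a), the following Python sketch simulates one long path of a scalar system of the form (1.3) and computes its pathwise average cost. Everything specific here is an assumption made for illustration and is not taken from the paper: the fixed feedback control $u(x) = -x$, the Ornstein-Uhlenbeck noise with correlation $R(u) = \tfrac12 e^{-|u|}$, the bounded cost $k(x) = \min(x^2, 4)$, the Euler time step, and the name `pathwise_average_cost`. Under these assumptions the limit diffusion is $dx = -x\,dt + dw$ with stationary law $N(0, 1/2)$, so the pathwise averages should settle near $0.5$ once $\epsilon$ is small and $T$ is large.

```python
import math
import random


def pathwise_average_cost(eps, T, dt, seed=0):
    """Time average of k(x) = min(x^2, 4) along ONE simulated path of
        dx/dt = -x + xi_eps(t)/eps,   xi_eps(t) = xi(t/eps^2),
    where xi is a stationary Ornstein-Uhlenbeck process with correlation
    R(u) = 0.5*exp(-|u|) (an illustrative choice of 'mixing' noise)."""
    rng = random.Random(seed)
    lam = 1.0 / eps ** 2                 # decay rate of z(t) = xi_eps(t)/eps
    rho = math.exp(-lam * dt)            # exact one-step correlation of z
    var_z = 0.5 * lam                    # stationary variance of z
    kick = math.sqrt(var_z * (1.0 - rho * rho))
    z = rng.gauss(0.0, math.sqrt(var_z))  # start the noise in its stationary law
    x = 0.0
    total = 0.0
    n_steps = int(round(T / dt))
    for _ in range(n_steps):
        total += min(x * x, 4.0) * dt    # accumulate the bounded running cost
        x += (-x + z) * dt               # Euler step for the state
        z = rho * z + kick * rng.gauss(0.0, 1.0)  # exact OU update for the noise
    return total / (n_steps * dt)


if __name__ == "__main__":
    # The time step is kept well below eps^2 so the noise is resolved.
    for eps, dt in ((0.2, 0.004), (0.1, 0.002)):
        print(eps, pathwise_average_cost(eps, T=200.0, dt=dt))
```

No expectation is taken anywhere in the sketch: each value comes from a single path, which is the situation the pathwise criterion is meant to cover.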

In Section 2, we list several assumptions and prove (1.8). In order to simplify the development, the technique of perturbed test functions from [5] is used. To facilitate the calculations, some of the conditions will be adapted from those used in that reference, but many useful generalizations should be clear. In Section 3, we redevelop the result of Section 2, using a ‘first order perturbed test function’ method, with less smoothness required on the functions and less mixing required on the noise but more details required in the proof. Some extensions are discussed in Section 4. The idea of ‘pathwise uniform’ convergence of a sample average cost per unit time has many other applications. One example is the Monte Carlo evaluation of Liapunov exponents with wide bandwidth noise coefficients for linear systems [6]. The formula for these exponents is of the form of an average cost per unit time. For this problem, it is shown in Section 4 that the Monte Carlo evaluated pathwise average cost per unit time converges (as $\epsilon \to 0$, $T \to \infty$) to the same limit that one would obtain were the actual limit diffusion used for the evaluation. The limit depends only on the correlation function of the noise $\xi^\epsilon(\cdot)$. Such a result is essential for the Monte Carlo method to be useful, and for the Liapunov exponents of the limit system to be meaningful indicators of the behavior of the actual (wide bandwidth noise driven) physical system.

An  extension  to  a  problem  of  average  pathwise  error  per  unit  time  for  an 
‘approximate’  non-linear  filter  for  a  system  with  wide  bandwidth  driving  and 
observation  noise  is  also  discussed  in  Section  4. 

In Section 5, we treat extensions to the discounted cost case. Define the pathwise discounted cost

$$V^\epsilon_\beta(m) = \beta\int_0^\infty e^{-\beta s}\!\int k(x^\epsilon(s),\alpha)\,m_s(d\alpha)\,ds,$$

and let $m^\epsilon(\cdot)$ be a sequence of $\delta_1$-optimal controls. We show that

$$V^\epsilon_\beta(u^\delta) \xrightarrow{P} \gamma(u^\delta), \quad \text{as } \beta \to 0,\ \epsilon \to 0, \tag{1.9a}$$

$$\lim_{\epsilon,\beta} P\{V^\epsilon_\beta(m^\epsilon) \ge \gamma(u^\delta) - \delta\} = 1. \tag{1.9b}$$

The uniformity result is important, since we would not want the speed with which $\beta \to 0$ to depend on the bandwidth in order to get the proper approximation. The sense in which $m^\epsilon(\cdot)$ is $\delta_1$-optimal is left purposely vague, since (1.9) holds for any $\{m^\epsilon(\cdot)\}$, under the conditions below. Thus for small $\epsilon$, $\beta$, $u^\delta(\cdot)$ is always nearly optimal. There also are extensions to impulsive and singular control problems.

2.  A  Basic  Convergence  Theorem

For convenience in this section, we use the assumptions of [5, Chapter 4.6], with appropriate modification for the relaxed controls. The system (1.3) will take the form

$$\dot x^\epsilon = \int G(x^\epsilon,\alpha)\,m_t(d\alpha) + G_0(x^\epsilon,\xi^\epsilon) + F(x^\epsilon,\xi^\epsilon)/\epsilon. \tag{2.1}$$

(2.1) is a common way of getting a wide bandwidth noise driven system [5, 13, 14]. Other forms for $F(x,\xi)/\epsilon$ can be used. See, e.g., the examples in [5] where the use of perturbed test functions for weak convergence is illustrated. We use either bounded noise or Gaussian noise. For the first case (A2.1) - (A2.6) are used. The second case is covered by (A2.10). Let $E^\epsilon_t$ denote the expectation conditioned on $\xi^\epsilon(s)$, $s \le t$, and $E_t$ the expectation conditioned on $\xi(s)$, $s \le t$.
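
To see why the $1/\epsilon$ factor together with the time scaling $\xi^\epsilon(t) = \xi(t/\epsilon^2)$ produces a white noise limit, it can help to compute the variance of the integrated noise $W^\epsilon(t) = \epsilon^{-1}\int_0^t \xi(s/\epsilon^2)\,ds$. The sketch below assumes, purely for illustration, a stationary noise with correlation $R(u) = \tfrac12 e^{-|u|}$ (not a choice made in the paper); the function names are hypothetical. For this $R$, the variance is $(2/\epsilon^2)\int_0^t (t-v)R(v/\epsilon^2)\,dv = t - \epsilon^2(1 - e^{-t/\epsilon^2})$, which tends to $t$, the variance of a standard Wiener process, as $\epsilon \to 0$.

```python
import math


def R(u):
    """Assumed correlation function of the noise xi: R(u) = 0.5*exp(-|u|)."""
    return 0.5 * math.exp(-abs(u))


def variance_closed_form(t, eps):
    """Closed form of Var[(1/eps) * int_0^t xi(s/eps^2) ds] for the R above."""
    return t - eps ** 2 * (1.0 - math.exp(-t / eps ** 2))


def variance_quadrature(t, eps, n=200_000):
    """Midpoint-rule value of (2/eps^2) * int_0^t (t - v) R(v/eps^2) dv,
    used as an independent check of the closed form."""
    h = t / n
    total = 0.0
    for j in range(n):
        v = (j + 0.5) * h
        total += (t - v) * R(v / eps ** 2)
    return 2.0 * total * h / eps ** 2


if __name__ == "__main__":
    for eps in (0.5, 0.1, 0.01):
        print(eps, variance_closed_form(1.0, eps))
```

The gap between the variance and $t$ is of order $\epsilon^2$ here, independently of $t$, which is one simple instance of an approximation that does not degrade as the time horizon grows.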


A2.1. $G(\cdot,\cdot)$, $F(\cdot,\cdot)$, $G_0(\cdot,\cdot)$, $F_x(\cdot,\cdot)$ are continuous in $(x,\xi)$. $G_{0,x}(\cdot,\xi)$ is continuous in $x$ for each $\xi$ and is bounded. $\xi(\cdot)$ is bounded, right continuous and $EG_0(x,\xi) = EF(x,\xi) = 0$.

A2.2. $F_{xx}(\cdot,\xi)$ is continuous


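A2.3. Let $V(x,\xi)$ denote either $\epsilon G_0(x,\xi)$, $G_{0,x}(x,\xi)$, $F(x,\xi)$ or $F_x(x,\xi)$. Then for compact $Q$,

$$\epsilon \sup_{x\in Q}\left|\int_{t/\epsilon^2}^{\infty} E^\epsilon_t V(x,\xi(s))\,ds\right| \to 0$$

in the mean square sense, uniformly in $t$.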


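Let $F_i$ denote the $i$th component of $F$.

A2.4. There are continuous $\bar F_i(\cdot)$, $\bar a(\cdot)$ such that

$$\int_t^\infty E\,F'_{i,x}(x,\xi(s))F(x,\xi(t))\,ds \to \bar F_i(x),$$

$$\int_t^\infty E\,F_i(x,\xi(s))F_j(x,\xi(t))\,ds \to \bar a_{ij}(x)/2,$$

as $t \to \infty$, and the convergence is uniform in any bounded $x$-set.

Define $a(x) = \tfrac12[\bar a(x) + \bar a'(x)]$.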


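A2.5. For each compact set $Q$,

$$\sup_{x\in Q}\ \epsilon\left|\int_{t/\epsilon^2}^{\infty} d\tau \int_\tau^\infty ds\,\big[E^\epsilon_t F'_{i,x}(x,\xi(s))F(x,\xi(\tau)) - E\,F'_{i,x}(x,\xi(s))F(x,\xi(\tau))\big]\right| \to 0,$$

$$\sup_{x\in Q}\ \epsilon\left|\int_{t/\epsilon^2}^{\infty} d\tau \int_\tau^\infty ds\,\big[E^\epsilon_t F(x,\xi(s))F'(x,\xi(\tau)) - E\,F(x,\xi(s))F'(x,\xi(\tau))\big]\right| \to 0$$

in the mean square sense as $\epsilon \to 0$, uniformly in $t$. Similarly, when the bracketed terms are replaced by their $x$-gradients.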


Remark. (A2.4) is just a condition on the rate of convergence of an expectation to a ‘stationary’ value as $t \to \infty$. (A2.3) and (A2.5) are just conditions on the rate of convergence of a conditional expectation to an expectation as the ‘time difference’ goes to infinity. They are easily shown to be satisfied under appropriate mixing conditions on $\xi(\cdot)$ [7, Chapter 4]. They are similar to conditions used in [13, 14] for weak convergence of a sequence of Markov processes.

Define $b(x,\alpha) = G(x,\alpha) + \bar F(x)$ and the operators $A^m$ (when $m$ is a feedback relaxed control $m_x$; see the Appendix for the definition) and $A^\alpha$ and $A^u$ as follows:

$$A^\alpha f(x) = f_x'(x)b(x,\alpha) + \frac12\sum_{i,j} a_{ij}(x) f_{x_i x_j}(x),$$

$$A^m f(x) = \int A^\alpha f(x)\,m_x(d\alpha),$$

and for $A^u$, we replace the $\alpha$ in the definition of $A^\alpha$ by the classical control function $u(\cdot)$.

A2.6. The martingale problem for operator $A^m$ has a unique solution for each relaxed admissible feedback control $m_x(\cdot)$, and each initial condition. The process is a Feller process. The solution of (2.1) is unique in the weak sense for each $\epsilon > 0$.



Remark. The uniqueness and existence is guaranteed if the operator $A^m$ is that for the system

$$dx = b(x)\,dt + \begin{bmatrix} \int \hat b(x,\alpha)\,m_x(d\alpha)\,dt \\ 0 \end{bmatrix} + \begin{bmatrix} \sigma(x)\,dw \\ 0 \end{bmatrix}, \tag{2.2}$$

where $\sigma\sigma' \ge \delta I$ for all $x$ and some $\delta > 0$, $b(\cdot)$ and $\sigma(\cdot)$ are Lipschitz continuous and $\hat b(\cdot,\cdot)$ is merely bounded and Borel measurable and the dimensions of (the vector) $\hat b$ and (square matrix) $\sigma\sigma'$ are equal.

Let $M$ denote the space of probability measures on the Borel sets of $R^r \times U$, with the ‘weak compact’ topology where $P_n \to P$ iff $\int f(x,\alpha)P_n(dx\,d\alpha) \to \int f(x,\alpha)P(dx\,d\alpha)$ for each continuous function $f(\cdot)$ with compact support. For an admissible relaxed control for (2.1) and (1.1), resp., define the (occupation) measure valued random variables $P^{m,\epsilon}_T(\cdot)$ and $P^m_T(\cdot)$ by, resp.,

$$P^{m,\epsilon}_T(B \times C) = \frac1T\int_0^T I_{\{x^\epsilon(t)\in B\}}\,m_t(C)\,dt,$$

$$P^{m}_T(B \times C) = \frac1T\int_0^T I_{\{x(t)\in B\}}\,m_t(C)\,dt.$$

We sometimes write $m^\epsilon(\cdot)$, if the model is (2.1). If the relaxed control for (1.1) is of the feedback form ($m_x$ or $u(x)$), then we use the modification

$$P^{m}_T(B) = \frac1T\int_0^T I_{\{x(t)\in B\}}\,dt$$

(or with $u$ replacing $m$), and similarly define $P^{m,\epsilon}_T(B)$, $P^{u,\epsilon}_T(B)$ for feedback $m(\cdot)$ and $u(\cdot)$.

Let $m^\epsilon(\cdot)$ be $\delta_1$-optimal (in any sense) and let $u^\delta(\cdot)$ be defined by (A2.8).
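
The role of the occupation measure is that a pathwise average cost is just the integral of the cost against it. The following Python sketch shows this for a path that is piecewise constant on a uniform time grid, with a relaxed control given at each step as a finite set of weights over control values. The discretization, the finite control set, and the names `occupation_measure`, `integrate`, and `time_average_cost` are all assumptions for illustration, not constructions from the paper.

```python
from collections import defaultdict


def occupation_measure(path, controls, dt):
    """Empirical occupation measure of a state-control pair.

    path     : list of states, held constant over steps of length dt
    controls : list of dicts {alpha: weight}; each dict is the relaxed
               control m_t on that step and its weights sum to 1
    Returns a dict {(x, alpha): mass} whose total mass is 1.
    """
    T = dt * len(path)
    P = defaultdict(float)
    for x, m_t in zip(path, controls):
        for alpha, weight in m_t.items():
            P[(x, alpha)] += weight * dt / T
    return dict(P)


def integrate(P, k):
    """Integral of k(x, alpha) against the occupation measure P."""
    return sum(k(x, alpha) * mass for (x, alpha), mass in P.items())


def time_average_cost(path, controls, dt, k):
    """(1/T) * int_0^T int k(x(t), alpha) m_t(d alpha) dt on the grid."""
    T = dt * len(path)
    return sum(weight * k(x, alpha) * dt
               for x, m_t in zip(path, controls)
               for alpha, weight in m_t.items()) / T


if __name__ == "__main__":
    path = [0, 1, 1, 2]
    controls = [{0: 1.0}, {0: 0.5, 1: 0.5}, {1: 1.0}, {0: 0.25, 1: 0.75}]
    cost = lambda x, alpha: x + 2 * alpha
    P = occupation_measure(path, controls, dt=0.5)
    print(integrate(P, cost), time_average_cost(path, controls, 0.5, cost))
```

Both printed numbers agree, which is the discrete analogue of rewriting a time average as an integral with respect to $P^{m}_T$.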


A2.7. The set of random variables $\{x^\epsilon(t),\ \epsilon > 0,\ t < \infty\}$ is tight.

Remark. The tightness in (A2.7) implies the tightness of the set of $M$ valued random variables $\{P^{m,\epsilon}_T(\cdot),\ \epsilon > 0,\ T < \infty,\ m = u^\delta \text{ or the above } m^\epsilon(\cdot)\}$. Under a stability condition on the limit equation (1.1) in the absence of control, and some other conditions, the tightness can be proved by a ‘perturbed Liapunov function’ method [5]. Of course, if the state space is compact, as for the ‘Liapunov exponent’ problem in Section 4, then (A2.7) always holds. In lieu of a ‘universal stability condition’, a condition on the minimum (over the control values) magnitude of the cost $k(\cdot)$ as $|x| \to \infty$ was used in [3] (for the model (1.1)) to get that an optimal control for that model is ‘stabilizing’. Perhaps a similar idea can be used here. But this point won’t be pursued.


A2.8. For each $\delta > 0$, there is a smooth $\delta$-optimal feedback control $u^\delta(\cdot)$ for (1.1) and (1.2), for which the martingale problem has a unique solution for each initial condition. The solution is a Feller process and there is a unique invariant measure $\mu(u^\delta,\cdot)$. [$u^\delta$ is $\delta$-optimal in the sense that $\gamma(u^\delta) \le \gamma(m_x) + \delta$ for the stationary initial condition, for any feedback relaxed control $m_x$ for which there is a stationary solution to the associated martingale problem.]

A2.9. $k(\cdot)$ is bounded and continuous.

Remark. The existence of such smooth $\delta$-optimal controls is dealt with in [7]. It will exist under an appropriate stability condition on the uncontrolled (1.1), and either non-degeneracy of (1.1) or for a system of the form (2.2) [7]. It turns out that the mean cost (1.2) and the pathwise cost (1.5) under $u^\delta$ agree w.p.1 (this follows from the method of proof of Theorem 1 below, or from the method in [3], under the conditions there).

A2.10. (Gaussian case). $\xi(\cdot)$ is a stable Gauss-Markov process with a stationary transition function and let $F(x,\xi) = F(x)\xi$, $G_0(x,\xi) = G_0(x)\xi$, where $G$, $G_0$, and $F$ satisfy the (in $x$) smoothness in (A2.1) - (A2.2). Define $\bar F(\cdot)$ and $a(\cdot)$ as in (A2.4). [Note: the other parts of (A2.3) - (A2.5) all hold.]

Theorem 1. Assume either (A2.1) to (A2.9) or (A2.6) to (A2.10). Then (1.8a) and (1.8b) hold.

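Proof. We do the ‘Gaussian’ case only. The other case is treated in essentially the same way. Let $\mathcal{F}$ be a (countable) measure determining set of bounded continuous functions which have continuous second partial derivatives, and are constant for large $|x|$. Let $m^\epsilon(\cdot)$ be the relaxed control introduced just above (A2.7). Define the test function perturbations (the change of scale $\tau/\epsilon^2 \to \tau$ yielding the right sides of the equations below will be used frequently and often without specific mention)

$$f_0^\epsilon(x,t) = \int_t^\infty E^\epsilon_t f_x'(x)G_0(x,\xi^\epsilon(\tau))\,d\tau = \epsilon^2\int_{t/\epsilon^2}^\infty E^\epsilon_t f_x'(x)G_0(x,\xi(\tau))\,d\tau = O(\epsilon^2)|\xi^\epsilon(t)|,$$

$$f_1^\epsilon(x,t) = \int_t^\infty E^\epsilon_t f_x'(x)F(x,\xi^\epsilon(\tau))\,d\tau/\epsilon = \epsilon\int_{t/\epsilon^2}^\infty E^\epsilon_t f_x'(x)F(x,\xi(\tau))\,d\tau = O(\epsilon)|\xi^\epsilon(t)|,$$

$$f_2^\epsilon(x,t) = \frac{1}{\epsilon^2}\int_t^\infty d\tau\int_\tau^\infty ds\,\big\{E^\epsilon_t[f_x'(x)F(x,\xi^\epsilon(s))]_x'F(x,\xi^\epsilon(\tau)) - E[f_x'(x)F(x,\xi^\epsilon(s))]_x'F(x,\xi^\epsilon(\tau))\big\}$$

$$\qquad = \epsilon^2\int_{t/\epsilon^2}^\infty d\tau\int_\tau^\infty ds\,\big\{E^\epsilon_t[f_x'(x)F(x,\xi(s))]_x'F(x,\xi(\tau)) - E[f_x'(x)F(x,\xi(s))]_x'F(x,\xi(\tau))\big\} = O(\epsilon^2)[|\xi^\epsilon(t)|^2 + 1].$$

The $|\xi^\epsilon(t)|$ terms come from the Gauss-Markov property.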


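Define

$$f^\epsilon(t) = f(x^\epsilon(t)) + \sum_{i=0}^{2} f_i^\epsilon(x^\epsilon(t),t).$$

The operator $A^{m^\epsilon,\epsilon}$ and its domain $D(A^{m^\epsilon,\epsilon})$ are defined in the Appendix. By a direct calculation, using the correlation and conditional expectation properties of the Gauss-Markov process $\xi(\cdot)$, we get that $f(x^\epsilon(\cdot))$ and the $f_i^\epsilon(x^\epsilon(\cdot),\cdot)$ are all in $D(A^{m^\epsilon,\epsilon})$, and

$$A^{m^\epsilon,\epsilon}f(x^\epsilon(t)) = f_x'(x^\epsilon(t))\,\dot x^\epsilon(t),$$

$$A^{m^\epsilon,\epsilon}f_0^\epsilon(x^\epsilon(t),t) = -f_x'(x^\epsilon(t))G_0(x^\epsilon(t),\xi^\epsilon(t)) + \int_t^\infty \big[E^\epsilon_t f_x'(x^\epsilon(t))G_0(x^\epsilon(t),\xi^\epsilon(s))\big]_x'\,\dot x^\epsilon(t)\,ds,$$

$$A^{m^\epsilon,\epsilon}f_1^\epsilon(x^\epsilon(t),t) = -f_x'(x^\epsilon(t))F(x^\epsilon(t),\xi^\epsilon(t))/\epsilon + \int_t^\infty ds\,\big[E^\epsilon_t f_x'(x^\epsilon(t))F(x^\epsilon(t),\xi^\epsilon(s))\big]_x'\,\dot x^\epsilon(t)/\epsilon,$$

and similarly for $A^{m^\epsilon,\epsilon}f_2^\epsilon(x^\epsilon(t),t)$. See the very similar calculation in [7, Chapter 4] or in [15] where the dynamical terms depend smoothly on $x$, and are right continuous in $t$.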

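We have

$$|f(x^\epsilon(t)) - f^\epsilon(t)| = O(\epsilon)[|\xi^\epsilon(t)|^2 + 1]. \tag{2.3a}$$

By adding the $A^{m^\epsilon,\epsilon}f_i^\epsilon(t)$ to $A^{m^\epsilon,\epsilon}f(x^\epsilon(t))$, subtracting from $A^{m^\epsilon}f(x^\epsilon(t))$ and cancelling terms where possible we get

$$|A^{m^\epsilon,\epsilon}f^\epsilon(t) - A^{m^\epsilon}f(x^\epsilon(t))| = O(\epsilon)[|\xi^\epsilon(t)|^3 + 1]. \tag{2.3b}$$

All the $O(\epsilon)$ are uniform in $t$, $\epsilon$, $\omega$. By equation 4 of the Appendix (with our $f^\epsilon$ replacing the $q$ there), the function

$$M_f^\epsilon(t) = f^\epsilon(t) - f^\epsilon(0) - \int_0^t A^{m^\epsilon,\epsilon}f^\epsilon(s)\,ds \tag{2.4}$$

is a zero mean martingale. We next show that $M_f^\epsilon(t)/t \xrightarrow{P} 0$ as $t \to \infty$ and $\epsilon \to 0$ in any way at all.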

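Write (where $[t]$ denotes the greatest integer part of $t$)

$$\frac{M_f^\epsilon(t)}{t} = \frac1t\big[(M_f^\epsilon(t) - M_f^\epsilon([t])) + M_f^\epsilon(0)\big] + \frac1t\sum_{n=0}^{[t]-1}\big[M_f^\epsilon(n+1) - M_f^\epsilon(n)\big]. \tag{2.5}$$

Using the fact that $f(\cdot)$ is bounded and (2.3), (2.5) and the martingale property of $M_f^\epsilon(\cdot)$, we get that $E[M_f^\epsilon(t)/t]^2 = O(1)/t$. The fact that $M_f^\epsilon(t)/t$, $f^\epsilon(t)/t$ and $f^\epsilon(0)/t$ all go to zero in probability as $t \to \infty$ (uniformly in $\epsilon$), together with (2.4) and the second line of (2.3), implies that as $t \to \infty$ and $\epsilon \to 0$,

$$\int_0^t A^{m^\epsilon}f(x^\epsilon(s))\,ds/t \xrightarrow{P} 0. \tag{2.6a}$$

By the definition of $P^{m^\epsilon,\epsilon}_T(\cdot)$, (2.6a) can be written as

$$\int A^\alpha f(x)\,P^{m^\epsilon,\epsilon}_T(dx\,d\alpha) \xrightarrow{P} 0, \quad \text{as } T \to \infty \text{ and } \epsilon \to 0. \tag{2.6b}$$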
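
The estimate $E[M_f^\epsilon(t)/t]^2 = O(1)/t$ rests on the orthogonality of martingale differences: cross terms have zero mean, so the mean square of the sum is the sum of the mean squares. The Python sketch below checks this exactly on a toy discrete martingale by enumerating all sign sequences. The toy model (fair $\pm1$ signs multiplied by a bounded factor that depends only on the past) and the function names are assumptions for illustration and are unrelated to the specific martingale (2.4).

```python
from itertools import product


def bounded_predictable_scale(past_signs):
    """Hypothetical bounded factor in [0.5, 1] that depends only on the past."""
    return 1.0 if sum(past_signs) >= 0 else 0.5


def mean_square_of_average(n_terms, scale=bounded_predictable_scale):
    """Exact E[(S_N / N)^2] for S_N = sum_n s_n * scale(s_0, ..., s_{n-1}),
    where the s_n are independent fair +-1 signs. Each summand is a
    martingale difference because its conditional mean given the past is 0."""
    total = 0.0
    for signs in product((-1, 1), repeat=n_terms):
        partial_sum = 0.0
        for n, s in enumerate(signs):
            partial_sum += s * scale(signs[:n])
        total += (partial_sum / n_terms) ** 2
    return total / 2 ** n_terms


if __name__ == "__main__":
    for n_terms in (4, 8, 12):
        # N * E[(S_N/N)^2] stays between 0.25 and 1, i.e. the bound is O(1/N).
        print(n_terms, n_terms * mean_square_of_average(n_terms))
```

Because the bound uses only the uniform bound on the differences, it holds uniformly over any family of such martingales, which is the feature used above to get convergence uniformly in $\epsilon$.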

Now, let the control be the classical control function $u^\delta(\cdot)$, and choose a weakly convergent subsequence of the set of random variables $\{P^{u^\delta,\epsilon}_T(\cdot),\ \epsilon, T\}$ (and also such that $\frac1T\int_0^T A^{u^\delta}f(x^\epsilon(s))\,ds \to 0$ w.p.1 for all $f(\cdot) \in \mathcal{F}$), indexed by $\epsilon_n$, $T_n$, and with (random) limit denoted by $\bar\mu(\cdot)$. We let the limits $\bar\mu(\cdot)$ be defined on some probability space $(\tilde\Omega, \tilde P, \tilde{\mathcal{F}})$ with generic variable $\omega$. Now, (2.6b) implies that

$$\int A^{u^\delta}f(x)\,\bar\mu(dx) = 0, \quad \tilde P\text{-almost all } \omega. \tag{2.7}$$

Since our class of $f(\cdot)$ is measure determining, (2.7) implies that almost all realizations of $\bar\mu(\cdot)$ are invariant measures for (1.1) (under $u^\delta$). [This is proved by a slight extension of Prop. 9.2 of [8].] By uniqueness of the invariant measure, we can take $\mu(u^\delta,\cdot) = \bar\mu(\cdot)$ for all $\omega$, and the limit $\bar\mu(\cdot)$ does not depend on the chosen subsequence $\epsilon_n$, $T_n$. Furthermore, by the definition of $P^{u^\delta,\epsilon}_t(\cdot)$,

$$\int_0^t k(x^\epsilon(s),u^\delta(x^\epsilon(s)))\,ds/t = \int k(x,u^\delta(x))\,P^{u^\delta,\epsilon}_t(dx).$$

Since $k(\cdot)$ is bounded and continuous and every weak limit of the $P^{u^\delta,\epsilon}_t(\cdot)$ is $\mu(u^\delta,\cdot)$, this yields (1.8a).

Next, choose a weakly convergent subsequence of $\{P^{m^\epsilon,\epsilon}_T(\cdot),\ \epsilon, T\}$ (and also such that (2.6a) $\to 0$ w.p.1 for all $f(\cdot) \in \mathcal{F}$) indexed by $\epsilon_n$, $T_n$, and with limit denoted by $\hat P(\cdot)$ (again, defined on some probability space $(\tilde\Omega, \tilde P, \tilde{\mathcal{F}})$). For each $\omega$, we can factor $\hat P(\cdot)$ as $\hat P(dx\,d\alpha) = m_x(d\alpha)\mu(dx)$. We can suppose that the $m_x(B)$ are $x$-measurable for each Borel $B$ and $\omega$.

By (2.6), for all $f(\cdot) \in \mathcal{F}$,

$$\int\!\!\int A^\alpha f(x)\,m_x(d\alpha)\,\mu(dx) = 0 \quad \text{for } \tilde P\text{-almost all } \omega. \tag{2.8}$$

This implies that (for a.a. $\omega$) $\mu(\cdot)$ is an invariant measure for the process (1.1) with relaxed feedback control $m_x(\cdot)$. As above we also have

$$\int\!\!\int k(x,\alpha)\,m_x(d\alpha)\,\mu(dx) = \lim_n \gamma^{\epsilon_n}_{T_n}(m^{\epsilon_n}) = \gamma(m_x). \tag{2.9}$$

But, by the $\delta$-optimality of $u^\delta(\cdot)$, for almost all $\omega$ we have $\gamma(m_x) \ge \gamma(u^\delta) - \delta$. Since this is true for all the limits of the tight set $\{P^{m^\epsilon,\epsilon}_T(\cdot);\ \epsilon, T\}$, (1.8b) follows. Q.E.D.

3.  Alternative  Conditions

In this section we redo Theorem 1 under somewhat different conditions. The perturbed test function is only ‘first order’ here and (2.3) won’t hold. But similar results are obtained via a direct averaging method of the type introduced in [5, Chapter 5]. We will use either bounded ‘mixing’ or Gaussian noise, as in Section 2, and subsets of the following conditions. Let $E_t$ denote the expectation given $\xi(s)$, $s \le t$.


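A3.1. $\xi(\cdot)$ is bounded and right continuous. $G_0(\cdot,\cdot)$, $G(\cdot,\cdot)$, $F(\cdot,\cdot)$, $F_x(\cdot,\cdot)$ are continuous.

A3.2. The functions

$$\int_t^\infty E_t F(x,\xi(s))\,ds, \qquad \int_t^\infty E_t[f_x'(x)F(x,\xi(s))]_x'F(x,\xi(t))\,ds$$

are bounded and $x$-continuous, uniformly in $t$, $\omega$.

A3.3.

$$\frac1T\int_t^{t+T} E_t G_0(x,\xi(s))\,ds \to 0,$$

for each $x$ as $t$ and $T \to \infty$.

A3.4. There are continuous $\bar F(\cdot)$, $a(\cdot)$ such that, with $A_0$ given by

$$A_0 f(x) = f_x'(x)\bar F(x) + \frac12\sum_{i,j} a_{ij}(x) f_{x_i x_j}(x),$$

we have

$$\frac1T\int_t^{t+T} ds\int_s^\infty du\,E_t[f_x'(x)F(x,\xi(u))]_x'F(x,\xi(s)) \to A_0 f(x),$$

for each $x$ as $t$ and $T \to \infty$.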




A3.5. $\xi(\cdot)$ is a stable Gauss-Markov process, with a stationary transition function, and $F(x,\xi) = F(x)\xi$, $G_0(x,\xi) = G_0(x)\xi$, and $F(\cdot)$, $G(\cdot,\cdot)$ and $G_0(\cdot)$ have the smoothness of (A3.1). [We continue to define $\bar F(\cdot)$, $a(\cdot)$ and $A_0$ as in (A3.4), when (A3.5) is used.]

As in Section 2, set $A^\alpha f(x) = f_x'(x)G(x,\alpha) + A_0 f(x)$, and $b(x,\alpha) = G(x,\alpha) + \bar F(x)$.

Theorem 2. Assume (A2.6) to (A2.9) and either (A3.1) to (A3.4) or else (A3.5). Then (1.8a) and (1.8b) hold.


Proof. Let $f(\cdot)$ be as in Theorem 1. We use the ‘direct averaging first order perturbed test function method’ of [5, Chapter 5], [9], [1], but the development here is self contained. Define $f_1^\epsilon(x,t)$ as in Theorem 1 and set $f^\epsilon(t) = f(x^\epsilon(t)) + f_1^\epsilon(x^\epsilon(t),t)$. Then (write $x$ for $x^\epsilon(t)$ for convenience here) $f^\epsilon(\cdot) \in D(A^{m^\epsilon,\epsilon})$ and

$$A^{m^\epsilon,\epsilon}f^\epsilon(t) = f_x'(x)\Big[\int G(x,\alpha)\,m^\epsilon_t(d\alpha) + G_0(x,\xi^\epsilon(t))\Big] + \frac{1}{\epsilon^2}\int_t^\infty ds\,\big[E^\epsilon_t f_x'(x)F(x,\xi^\epsilon(s))\big]_x'F(x,\xi^\epsilon(t)) + \text{terms of order } O(\epsilon)[|\xi^\epsilon(t)|^2 + 1].$$

(See the expressions given above (2.3).) Using the scale change $s/\epsilon^2 \to s$, the second term can be seen to be bounded in mean square for the bounded noise case and $O(1)[|\xi^\epsilon(t)|^2 + 1]$ in the Gaussian case.

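Define the martingale

$$M_f^\epsilon(t) = f^\epsilon(t) - f^\epsilon(0) - \int_0^t A^{m^\epsilon,\epsilon}f^\epsilon(s)\,ds.$$

If

$$M_f^\epsilon(t)/t \xrightarrow{P} 0 \quad \text{as } \epsilon \to 0,\ t \to \infty, \tag{3.1}$$

then as in Theorem 1, we have

$$\int_0^t A^{m^\epsilon,\epsilon}f^\epsilon(s)\,ds/t \xrightarrow{P} 0, \quad \text{as } \epsilon \to 0,\ t \to \infty.$$

If we also have that

$$\frac1t\int_0^t \big[A^{m^\epsilon,\epsilon}f^\epsilon(s) - A^{m^\epsilon}f(x^\epsilon(s))\big]\,ds \xrightarrow{P} 0, \quad \text{as } \epsilon \to 0,\ t \to \infty, \tag{3.2}$$

(and also for $u^\delta$ used in lieu of $m^\epsilon(\cdot)$) then the proof can be completed as in Theorem 1. Thus, we need only show (3.1) and (3.2).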

To get (3.1), we use the representation (2.5). The martingale difference $M_f^\epsilon(n+1) - M_f^\epsilon(n)$ equals

$$f^\epsilon(n+1) - f^\epsilon(n) - \int_n^{n+1} ds\,f_x'(x^\epsilon(s))\Big[\int G(x^\epsilon(s),\alpha)\,m^\epsilon_s(d\alpha) + G_0(x^\epsilon(s),\xi^\epsilon(s))\Big] + \int_n^{n+1} ds\,O(1)[|\xi^\epsilon(s)|^2 + 1]. \tag{3.3}$$

Since the mean square value of (3.3) is bounded uniformly in $n$, $\omega$, $\epsilon$, we get that $E[M_f^\epsilon(t)/t]^2 = O(1/t)$ and (3.1) holds, exactly as for Theorem 1.

We now prove (3.2). To simplify the proof, we drop the terms $\int G(x,\alpha)\,m^\epsilon_t(d\alpha)$ and $G_0(x,\xi)$. The first dropped term causes no problems (as in Theorem 1) and the second is dealt with by an averaging method similar to that employed below. Now, we have

$$\frac1t\int_0^t ds\int_s^\infty du\,E^\epsilon_s[f_x'(x^\epsilon(s))F(x^\epsilon(s),\xi^\epsilon(u))]_x'F(x^\epsilon(s),\xi^\epsilon(s))/\epsilon^2 + \text{negligible terms}$$

$$= \frac{\epsilon^2}{t}\int_0^{t/\epsilon^2} ds\int_s^\infty du\,E_s[f_x'(x^\epsilon(\epsilon^2 s))F(x^\epsilon(\epsilon^2 s),\xi(u))]_x'F(x^\epsilon(\epsilon^2 s),\xi(s)) + \text{negligible terms}, \tag{3.4}$$

where the negligible terms go to zero in the mean square sense as $\epsilon \to 0$. Henceforth, for simplicity, we consider the scalar case and work with only the term $f_{xx}(x)F(x,\xi(u))F(x,\xi(s))$ in (3.4). Write $t = N\Delta$ for integer $N$ and $\Delta > 0$. Define

$$Q^\epsilon(x,s) = \int_s^\infty du\,E_s f_{xx}(x)F(x,\xi(u))F(x,\xi(s)).$$

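Then the desired term in (3.4) can be written as

$$-\frac1N\sum_{i=0}^{N-1}\frac{\epsilon^2}{\Delta}\int_{i\Delta/\epsilon^2}^{(i\Delta+\Delta)/\epsilon^2} ds\,\big[E^\epsilon_{i\Delta}Q^\epsilon(x^\epsilon(\epsilon^2 s),s) - Q^\epsilon(x^\epsilon(\epsilon^2 s),s)\big] + \frac1N\sum_{i=0}^{N-1}\frac{\epsilon^2}{\Delta}\int_{i\Delta/\epsilon^2}^{(i\Delta+\Delta)/\epsilon^2} E^\epsilon_{i\Delta}Q^\epsilon(x^\epsilon(\epsilon^2 s),s)\,ds. \tag{3.5}$$

Since $E|E^\epsilon_{i\Delta}Q^\epsilon(x^\epsilon(\epsilon^2 s),s) - Q^\epsilon(x^\epsilon(\epsilon^2 s),s)|^2$ is bounded uniformly in $s$, $\epsilon$ and $\Delta$, the first set of summands in (3.5) are martingale differences with uniformly (in $\epsilon$, $N$, $t$) bounded mean square values. Thus the first sum has mean square value $O(1/N)$ and goes to zero in probability as $N \to \infty$, uniformly in $\epsilon$, $t$. By [5, Chapter 3, Theorem 4, Part 1], and the uniform integrability of $\{A^{m^\epsilon,\epsilon}f^\epsilon(t),\ \epsilon > 0,\ t < \infty\}$, the sequence $\{x^\epsilon(i\Delta + \cdot) - x^\epsilon(i\Delta),\ i,\ \Delta > 0,\ \epsilon > 0\}$ is tight in $D[0,\infty)$ (Skorohod topology). Because of this, we can replace the $x^\epsilon(\epsilon^2 s)$ in the $i$th summand of the second term in (3.5) by $x^\epsilon(i\Delta)$ for all $i$, and only alter the sum by an amount which goes to zero in probability (uniformly in $\epsilon$ and $N$) as $\Delta \to 0$.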

Doing this replacement and using either the Gaussian property (A3.5) or else (A3.4) for the bounded noise case, and the continuity of $F(\cdot,\xi)$ (uniform in $\xi$ in the bounded noise case) and the continuity and compact support of $f_{xx}(\cdot)$ yields that the second sum in (3.5) and

$$\frac1N\sum_{i=0}^{N-1}\frac{\epsilon^2}{\Delta}\int_{i\Delta/\epsilon^2}^{(i\Delta+\Delta)/\epsilon^2} ds\,\tfrac12 f_{xx}(x^\epsilon(i\Delta))\,a(x^\epsilon(i\Delta)) \tag{3.6}$$

have the same limit in probability as $N \to \infty$, $\Delta \to 0$, $\epsilon \to 0$, $N\Delta \to \infty$. We next use the tightness of $\{x^\epsilon(i\Delta + \cdot) - x^\epsilon(i\Delta),\ i,\ \Delta > 0,\ \epsilon > 0\}$ again to replace the $x^\epsilon(i\Delta)$ in (3.6) by $x^\epsilon(\epsilon^2 s)$, and get the same result; namely that the limit in probability is the same as $N \to \infty$, $\Delta \to 0$, $\epsilon \to 0$, $N\Delta \to \infty$. Finally, repeating the approximation procedure used from (3.5) on for the various neglected terms yields (3.2). Q.E.D.

4.  Extensions

Discrete time problem. There are direct extensions to the discrete parameter model

$$X^\epsilon_{n+1} = X^\epsilon_n + \epsilon\int G(X^\epsilon_n,\alpha)\,m_n(d\alpha) + \epsilon G_0(X^\epsilon_n,\xi^\epsilon_n) + \sqrt{\epsilon}\,F(X^\epsilon_n,\xi^\epsilon_n). \tag{4.1}$$

In both (4.1) and (2.1), we can allow some ‘state dependence’ of the noise (cf. the ‘Markov’ dependent type used in [5, Chapters 4.4 or 5.5]).

Approximate non-linear filtering. In the following two applications, there is no control. In Section 7 of [10], an ‘approximate’ non-linear filtering problem was dealt with, where the system driving and observation noises were wideband. It was shown (under a condition concerning the uniqueness of a certain invariant measure) that the average error (using the notation of that paper)

$$\lim_{\epsilon,T}\frac1T\int_0^T E[\phi(x^\epsilon(t)) - (P^\epsilon(t),\phi)]^2\,dt$$

converged to what one would get if the true optimal filter were used on the ‘limit’ process. Here $x^\epsilon(\cdot)$ is the state of the ‘signal system’ (say, of the form (2.1)), $\phi(\cdot)$ is bounded and continuous, and $P^\epsilon(\cdot)$ is the measure valued output (not necessarily the conditional distribution) of the ‘approximate’ filters used in [12]. Via the technique of this paper, similar results can be obtained if the $E$ in the above average error were dropped. This is useful, since we would normally filter only one path, over a long time, and the use of the expectation might give an inappropriate measure of the filter performance.


Liapunov exponents. The theory of Liapunov exponents is well developed for systems of the form

$$dx = Ax\,dt + \sum_{i} B_i x \circ dw_i, \tag{4.2}$$

where the ‘$\circ$’ denotes that the stochastic integral is in the ‘Stratonovich’ sense and where the $w_i(\cdot)$ are real valued and mutually independent standard Wiener processes [11]. The ‘Stratonovich’ sense integral is used to be consistent with the usage in [11] and because it simplifies the identification of the limit process and its ‘projection’ below in this case. Of practical interest are the convergence properties of numerical methods of evaluating these exponents, as well as the study of the asymptotic behavior of wideband noise driven systems

$$\dot x^\epsilon = Ax^\epsilon + \sum_{i} B_i x^\epsilon \xi_i^\epsilon/\epsilon, \tag{4.3}$$

via the method of Liapunov exponents. In (4.3), the $\xi_i^\epsilon(\cdot)$ are orthogonal and scalar valued processes. Of particular interest is whether the exponents for (4.3) converge to those for the limit system (which will be of the general form of (4.2)) as $\epsilon \to 0$.

Under the conditions of Theorem 2 on $\xi^\epsilon(\cdot) = \xi(\cdot/\epsilon^2)$, the above orthogonality condition, and the normalization

$$\frac{1}{T} \int_t^{t+T} ds \int_s^\infty E_t\, \xi_i(s)\xi_i(u)\,du \to \frac{1}{2}$$

in probability as t and T go to $\infty$, the $x(\cdot)$ of (4.2) is the weak limit of (4.3), if the initial conditions converge. We can assume this normalization to hold in general, since otherwise we absorb the ‘constants’ into the $B_i$ in the obvious way.


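Define $y = x/|x|$. Then

$$\dot y = \dot x/|x| - x[x'\dot x]/|x|^3$$

and

(4.4)  $$\dot y^\epsilon = Ay^\epsilon + \sum_i B_i y^\epsilon \xi_i^\epsilon - y^\epsilon [y^{\epsilon\prime} A y^\epsilon] - y^\epsilon \Big[ y^{\epsilon\prime} \sum_i B_i y^\epsilon \xi_i^\epsilon \Big].$$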


Assume the noise conditions of Theorem 2. Then, it is not hard to show that $P\{x^\epsilon(s) \neq 0, \text{ any } s < T\} = 1$ for all $\epsilon$, T.

Of interest is the calculation of quantities such as $\lim_t E\int_0^t q(y^\epsilon(s))\,ds/t$ for bounded and continuous $q(\cdot)$. In the Monte Carlo evaluation of the limit, one often uses

(4.5)  $$\frac{1}{t}\int_0^t q(y^\epsilon(s))\,ds$$

for large t and some small $\epsilon$, and it is of interest to know whether or not the convergence is to the correct limit and whether it is uniform in $\epsilon$ and t in the sense of (1.8a). [An alternative is of course to fix $T < \infty$ and approximate $E\int_0^T q(y^\epsilon(s))\,ds/T$ for small $\epsilon$ by taking many independent runs and averaging. But, the ‘uniformity’ questions still arise.]

Define $y(t) = x(t)/|x(t)|$ and

$$q(y) = y'Ay + \sum_i \Big[ \tfrac{1}{2}\, y'(B_i + B_i')B_i y - (y'B_i y)^2 \Big],$$

and assume that $y(\cdot)$ has a unique invariant measure on the sphere (this is true under a Lie algebraic condition on the set $\{A, B_i, i \le k\}$ [11]). Then [11] the (maximal) Liapunov exponent is the limit (which is a constant w.p.1)


(4.6)  $$\lim_t \int_0^t q(y(s))\,ds/t.$$

One is interested in whether (4.5) converges to (4.6) as $\epsilon \to 0$ and $t \to \infty$.
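The integrand $q(\cdot)$ in (4.6) can be recovered from (4.2) by a routine Itô computation. The following is only a sketch of that standard calculation, not a reproduction of the argument in [11]; the symbols $\rho = \log|x|$ and $M(t)$ are introduced here purely for illustration.

```latex
% Sketch: (4.2) rewritten in Ito form, then Ito's formula applied to rho = log|x|.
\begin{align*}
dx &= \Big[A + \tfrac12 \sum_i B_i^2\Big] x\,dt + \sum_i B_i x\,dw_i, \\
d\rho &= \frac{x'\,dx}{|x|^2}
        + \frac12 \sum_i \Big[\frac{|B_i x|^2}{|x|^2} - \frac{2\,(x'B_i x)^2}{|x|^4}\Big] dt \\
      &= \Big\{ y'Ay + \sum_i \Big[\tfrac12\, y'(B_i + B_i')B_i y - (y'B_i y)^2\Big] \Big\}\,dt
        + \sum_i y'B_i y\,dw_i \\
      &= q(y)\,dt + dM, \\
\frac1t \log|x(t)| &= \frac1t \log|x(0)| + \frac1t \int_0^t q(y(s))\,ds + \frac{M(t)}{t}.
\end{align*}
```

Since $|y'B_i y| \le |B_i|$ on the unit sphere, the martingale term satisfies $M(t)/t \to 0$ w.p.1, so the growth rate of $\log|x(t)|$ is governed by the time average of $q(y(s))$.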

By Theorem 2, $(x^\epsilon(\cdot), y^\epsilon(\cdot)) \Rightarrow (x(\cdot), y(\cdot))$ (Skorohod topology), and the weak limit process $y(\cdot)$ is characterized completely by the correlation functions of the $\xi_i(\cdot)$. Let $\mu(\cdot)$ denote the assumed unique invariant measure for $y(\cdot)$. Then

(4.7)  $$\frac{1}{t}\int_0^t q(y^\epsilon(s))\,ds \xrightarrow{P} \int q(y)\,\mu(dy), \quad \text{as } \epsilon \to 0,\ t \to \infty,$$

and the limit value is just the (maximum) Liapunov exponent for $x(\cdot)$. The general method is applicable to a wide variety of noise processes and can readily be extended to yield convergence of various numerical approximations to the (maximal) Liapunov exponent for (4.2), via use of either a discrete time approximation to (4.2) or the various interpolations which can be used to approximate the stochastic integrals.
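As an illustration of the pathwise estimate (4.5), the following is a minimal sketch and not the authors' procedure. It assumes a two-dimensional system with a single noise term, models the wideband noise by an Ornstein-Uhlenbeck process with correlation time $\epsilon^2$ (scaled so that $\int_0^\infty E\xi(0)\xi(u)\,du = 1/2$), and advances the normalized state by a projected Euler step. The function names, the matrices, the step size and the run length are all hypothetical choices.

```python
import math
import random

def matvec(M, v):
    """2x2 matrix times 2-vector."""
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def q(y, A, B):
    """q(y) = y'Ay + (1/2) y'(B + B')B y - (y'By)^2 for one noise term."""
    By = matvec(B, y)
    # y'(B + B')By = y'B(By) + (By)'(By)
    return (dot(y, matvec(A, y))
            + 0.5 * (dot(y, matvec(B, By)) + dot(By, By))
            - dot(y, By) ** 2)

def pathwise_average(A, B, eps, t_end, dt, seed=0):
    """Riemann-sum version of (1/t) * int_0^t q(y_eps(s)) ds along ONE path."""
    rng = random.Random(seed)
    rate = 1.0 / eps ** 2                     # assumed OU rate (correlation time eps^2)
    decay = math.exp(-rate * dt)              # exact one-step OU decay
    kick = math.sqrt(0.5 * rate * (1.0 - decay ** 2))
    xi = rng.gauss(0.0, math.sqrt(0.5 * rate))  # start in the stationary law
    y = [1.0, 0.0]                            # arbitrary unit initial direction
    total = 0.0
    n = int(round(t_end / dt))
    for _ in range(n):
        total += q(y, A, B) * dt
        Ay, By = matvec(A, y), matvec(B, y)
        # Euler step for x' = Ax + Bx*xi started from the unit vector y ...
        x = [y[0] + dt * (Ay[0] + xi * By[0]), y[1] + dt * (Ay[1] + xi * By[1])]
        # ... followed by projection back onto the unit circle.
        norm = math.hypot(x[0], x[1])
        y = [x[0] / norm, x[1] / norm]
        xi = decay * xi + kick * rng.gauss(0.0, 1.0)
    return total / (n * dt)

if __name__ == "__main__":
    A = [[-1.0, 0.0], [0.0, 0.5]]   # hypothetical drift matrix
    B = [[0.0, -1.0], [1.0, 0.0]]   # hypothetical skew-symmetric noise matrix
    print(pathwise_average(A, B, eps=0.2, t_end=20.0, dt=1e-3, seed=1))
```

With the skew-symmetric B chosen here, $q(y)$ reduces to $y'Ay$, so any estimate must lie between the smallest and largest eigenvalue of A, which gives a quick sanity check. The step size has to be small compared with $\epsilon^2$ for the noise path to be resolved.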


Appendix 


Definition. Let U be a compact set in some Euclidean space. Let the $w(\cdot)$ in (1.1) be a Wiener process with respect to a filtration $\{F_t\}$. A measure valued (a measure on the Borel sets of $U \times [0,\infty)$) random variable $m(\cdot)$ is an admissible relaxed control if $\int_0^t\!\int f(s,\alpha)\,m(ds\,d\alpha)$ is progressively measurable for each bounded and continuous $f(\cdot)$ and $m([0,t] \times U) = t$. If $m(\cdot)$ is admissible, then there is a derivative $m_t(\cdot)$ (defined for almost all t) which is non-anticipative and

$$\int_0^t\!\int f(s,\alpha)\,m(ds\,d\alpha) = \int_0^t ds \int f(s,\alpha)\,m_s(d\alpha)$$

for all t w.p.1. Sometimes we use the ‘feedback’ relaxed control (which we write as $m_x(\cdot)$) which is a measure on the Borel sets of U for each x and $m_x(B)$ is Borel-measurable for each Borel B. The $m_t(\cdot)$ and $m_x(\cdot)$ will also be referred to as relaxed controls.

An admissible relaxed control $m(\cdot)$ for (2.1) is also a measure valued random variable (as above) but $\int_0^t\!\int f(s,\alpha)\,m(ds\,d\alpha)$ is progressively measurable with respect to $\{F_t^\epsilon\}$, where $F_t^\epsilon$ is the minimal σ-algebra measuring $\{\xi^\epsilon(s), x^\epsilon(s), s \le t\}$. Also, we impose $m([0,t] \times U) = t$. As above, there is also a derivative $m_t(\cdot)$, where the $m_t(B)$ are $F_t^\epsilon$ measurable for Borel B. We sometimes use the symbol $m^\epsilon(\cdot)$ or $m_t^\epsilon(\cdot)$ for the relaxed controls, when (2.1) is used.


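Definition. Let $q(\cdot)$ be progressively measurable with respect to $\{F_t^\epsilon\}$. Suppose that there is a progressively measurable (with respect to $\{F_t^\epsilon\}$) $g(\cdot)$ such that

(1)  $$\sup_{t \le T} E|g(t)| < \infty, \quad E|g(t+s) - g(t)| \to 0 \ \text{ as } s \downarrow 0, \text{ each } t,$$

(2)  $$\sup_{t \le T,\ \delta > 0} E\left| \frac{E_t^\epsilon q(t+\delta) - q(t)}{\delta} - g(t) \right| < \infty,$$

(3)  $$\lim_{\delta \downarrow 0} E\left| \frac{E_t^\epsilon q(t+\delta) - q(t)}{\delta} - g(t) \right| = 0, \ \text{ each } t.$$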


Then we say that $q(\cdot) \in D(A^{m,\epsilon})$, the domain of the operator $A^{m,\epsilon}$, and that $A^{m,\epsilon}q = g$. If $q(\cdot) \in D(A^{m,\epsilon})$, then [3, Chapter 3], [12],

(4)  $$q(t) - \int_0^t A^{m,\epsilon} q(s)\,ds$$

is a martingale. This martingale property will be heavily used in the proofs. We define $A^{\alpha,\epsilon}$ to be $A^{m,\epsilon}$ with $m_t$ concentrated at $\alpha$, and $A^{u,\epsilon}$ is defined in the obvious way.

The forms given for $A^{m,\epsilon}$ in Theorem 1 satisfy (1) - (3) if $\int G(x,\alpha)\,m_t(d\alpha)$ is right continuous w.p.1. Since we are only concerned with the use of $A^{m,\epsilon}q$ in an integral, to get the martingale property (4), the given forms work in general. Alternatively, they are precisely what one gets via the following procedure. Let $t = N\Delta$ for integer N and consider the following expression for $\{F_t^\epsilon\}$-progressively measurable $q(\cdot)$ with $\sup_t E|q(t)| < \infty$:

(5)  $$q(t) - q(0) - \sum_{i=0}^{N-1} E^\epsilon_{i\Delta}[q(i\Delta + \Delta) - q(i\Delta)].$$

Suppose that there is a progressively measurable $g(\cdot)$ such that the sum on the right side of (5) converges to $\int_0^t g(s)\,ds$ in mean as $\Delta \to 0$, for each t. Then


$$q(t) - q(0) - \int_0^t g(s)\,ds$$

is a zero mean $\{F_t^\epsilon\}$-martingale and we write $q \in D(A^{m,\epsilon})$ and $g = A^{m,\epsilon}q$.


5. Convergence of Pathwise Discounted Costs to the Ergodic Cost

In this section, we treat the discounted cost result (1.9). Again, the exact sense in which the $m^\epsilon(\cdot)$ are $\delta_1$-optimal is left a little vague. Since $u^\delta(\cdot)$ is asymptotically $\delta$-optimal, no matter what the $m^\epsilon(\cdot)$ are, the pathwise costs are (for small $\beta$, $\epsilon$) no better (modulo $2\delta$) than the costs for the $m^\epsilon(\cdot)$, with an arbitrarily large probability.


Theorem 3. Under the conditions of Theorem 1 or 2, (1.9) holds.


Remarks on the Proof. The proof is essentially the same as those of Theorems 1 or 2, and we only remark on the differences. We use the discounted occupation measures

(5.1)  $$P_\beta^{m^\epsilon,\epsilon}(B \times C) = \beta \int_0^\infty e^{-\beta t} I_{\{x^\epsilon(t) \in B\}}\, m_t^\epsilon(C)\,dt,$$
$$P_\beta^{m}(B \times C) = \beta \int_0^\infty e^{-\beta t} I_{\{x(t) \in B\}}\, m_t(C)\,dt$$

and analogously for the feedback control cases.

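Then the cost can be written as

$$V_\beta^\epsilon(m^\epsilon) = \int k(x,\alpha)\, P_\beta^{m^\epsilon,\epsilon}(dx\,d\alpha).$$

By the tightness conditions (A2.7), (A2.8), the $\{P_\beta^{m^\epsilon,\epsilon}(\cdot)\}$ and $\{P_\beta^{u^\delta,\epsilon}(\cdot)\}$ are tight. Define

$$f_\beta^\epsilon(t) = \beta e^{-\beta t} f^\epsilon(t).$$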


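This will be used in lieu of the $f^\epsilon(\cdot)$ in either Theorems 1 or 2. We have

(5.3)  $$A^{m^\epsilon,\epsilon} f_\beta^\epsilon(t) = -\beta^2 e^{-\beta t} f^\epsilon(t) + \beta e^{-\beta t} A^{m^\epsilon,\epsilon} f^\epsilon(t).$$

Define the martingale

$$f_\beta^\epsilon(t) - f_\beta^\epsilon(0) - \int_0^t A^{m^\epsilon,\epsilon} f_\beta^\epsilon(s)\,ds$$
$$= \beta e^{-\beta t} f^\epsilon(t) - \beta f^\epsilon(0) - \int_0^t [-\beta^2 e^{-\beta s} f^\epsilon(s) + \beta e^{-\beta s} A^{m^\epsilon,\epsilon} f^\epsilon(s)]\,ds.$$

As in Theorems 1 or 2

(5.4)  $$0 = \lim_{(\beta,\epsilon) \to 0} \beta \int_0^\infty e^{-\beta s} A^{m^\epsilon} f(x^\epsilon(s))\,ds,$$

$$0 = \lim_{(\beta,\epsilon) \to 0} \int A^\alpha f(x)\, P_\beta^{m^\epsilon,\epsilon}(dx\,d\alpha).$$

Again we choose weakly convergent subsequences of the $\{P_\beta^{m^\epsilon,\epsilon}(\cdot)\}$ or $\{P_\beta^{u^\delta,\epsilon}(\cdot)\}$ and continue as in the proofs of either Theorems 1 or 2 to get Theorem 3.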
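The relation between discounted and average pathwise costs that underlies (1.9) can be seen on a toy deterministic path. The sketch below is an illustration under stated assumptions only: the cost rate $k(t) = 1 + \sin t$ stands in for the running cost along one path, and the function names and numerical parameters are hypothetical. For this path the discounted cost $\beta\int_0^\infty e^{-\beta t}k(t)\,dt$ equals $1 + \beta/(1+\beta^2)$, which tends to the long-run time average 1 as $\beta \to 0$.

```python
import math

def cost_rate(t):
    """Hypothetical cost rate along one path; its long-run time average is 1."""
    return 1.0 + math.sin(t)

def discounted_cost(beta, dt=0.01, tail=40.0):
    """Midpoint-rule value of beta * int_0^H exp(-beta*t) k(t) dt with H = tail/beta.

    The truncation at H leaves out a remainder of order exp(-tail).
    """
    n = int(tail / (beta * dt))
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        total += math.exp(-beta * t) * cost_rate(t) * dt
    return beta * total

def average_cost(T, dt=0.01):
    """Midpoint-rule value of (1/T) * int_0^T k(t) dt."""
    n = int(T / dt)
    total = sum(cost_rate((i + 0.5) * dt) for i in range(n)) * dt
    return total / (n * dt)

if __name__ == "__main__":
    for beta in (0.5, 0.1, 0.01):
        exact = 1.0 + beta / (1.0 + beta ** 2)
        print(beta, discounted_cost(beta), exact)
    print("time average:", average_cost(2000.0))
```

The same comparison can be run with `cost_rate` replaced by the running cost evaluated along a simulated path, in which case no closed form is available and only the two numerical values can be compared.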